A Discriminative Candidate Generator for String Transformations
Abstract
String transformation, which maps a source string s into its desired form t∗, underlies applications such as stemming, lemmatization, and spelling correction. An essential step in string transformation is generating the candidates into which the given string s is likely to be transformed. This paper presents a discriminative approach to generating candidate strings. We use substring substitution rules as features and score them with an L1-regularized logistic regression model. We also propose a procedure for generating negative instances that affect the decision boundary of the model. The advantage of this approach is that candidate strings can be enumerated by an efficient algorithm, because the string transformation process is tractable in the model. We demonstrate the performance of the proposed method on normalizing inflected words and spelling variations.
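As a rough illustration of the idea in the abstract, the following Python sketch generates candidates by applying substring substitution rules and scores each one with a logistic function over rule weights. It is a minimal sketch under stated assumptions, not the paper's implementation: the rules, the weights in `RULE_WEIGHTS`, the `BIAS` value, and the function names are all hypothetical, and each candidate here uses only a single rule, whereas a full model would combine the weights of every rule involved in a transformation. In the approach described above, the weights would be learned by L1-regularized logistic regression rather than set by hand.

```python
import math

# Hypothetical rule weights: (source substring, target substring) -> weight.
# "^" and "$" mark the start and end of the string. These values are made up
# for illustration; they are not learned and not taken from the paper.
RULE_WEIGHTS = {
    ("ies$", "y$"): 2.0,
    ("s$", "$"): 1.0,
    ("ing$", "$"): 1.5,
}
BIAS = -1.0  # hypothetical bias term


def sigmoid(x):
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def apply_rule(source, rule):
    """Apply one substitution rule at every position where it matches."""
    lhs, rhs = rule
    marked = "^" + source + "$"
    results = []
    start = marked.find(lhs)
    while start != -1:
        rewritten = marked[:start] + rhs + marked[start + len(lhs):]
        results.append(rewritten[1:-1])  # drop the boundary markers
        start = marked.find(lhs, start + 1)
    return results


def generate_candidates(source):
    """Return (candidate, probability) pairs, best first.

    Simplification: each candidate is produced by exactly one rule, so its
    score is the bias plus that single rule's weight.
    """
    best = {}
    for rule, weight in RULE_WEIGHTS.items():
        for candidate in apply_rule(source, rule):
            prob = sigmoid(BIAS + weight)
            best[candidate] = max(prob, best.get(candidate, 0.0))
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for candidate, prob in generate_candidates("studies"):
        print(f"{candidate}\t{prob:.3f}")
```

With these made-up weights, the input `studies` yields two candidates, `study` and `studie`, with `study` ranked first because its rule carries the larger weight.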
Similar resources
Two Feature Transformation Methods Based on Genetic Algorithms for Reducing Support Vector Machine Classification Error
Discriminative methods are used for increasing pattern recognition and classification accuracy. These methods can be used as discriminant transformations applied to features or they can be used as discriminative learning algorithms for the classifiers. Usually, discriminative transformations criteria are different from the criteria of discriminant classifiers training or their error. In this ...
Comparative Study of Linear Feature Transformation Techniques for Mandarin Digit String Recognition
Linear feature transformation technique is widely used to improve feature discriminability. It can reduce the dimensionality of the feature space, un-correlate the feature components, hence more discriminative model can be obtained. In this paper we compare three discriminative linear transformation approaches in Mandarin digit string recognition (MDSR) system. Compared with the conventional Li...
Cryptographic potentials of quasigroup transformations
In this paper we show the potentials of string transformations by quasigroups, as a new paradigm in cryptography. To show that, we describe several algorithms that include a block cipher, a stream cipher, a hash function with variable length of output that is strongly collision free and a nonlinear pseudo random number generator. All those algorithms can be implemented using only several progra...
A Calculation of the plane wave string Hamiltonian from N = 4 super-Yang-Mills theory
Berenstein, Maldacena, and Nastase have proposed, as a limit of the strong form of the AdS/CFT correspondence, that string theory in a particular plane wave background is dual to a certain subset of operators in the N = 4 super-Yang-Mills theory. Even though this is a priori a strong/weak coupling duality, the matrix elements of the string theory Hamiltonian, when expressed in gauge theory vari...
Unbiased Random Sequences from Quasigroup String Transformations
The need of true random number generators for many purposes (ranging from applications in cryptography and stochastic simulation, to search heuristics and game playing) is increasing every day. Many sources of randomness possess the property of stationarity. However, while a biased die may be a good source of entropy, many applications require input in the form of unbiased bits, rather than bia...